{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# MNIST Image Classification with TensorFlow on Cloud ML Engine\n",
    "\n",
    "This notebook demonstrates how to implement different image models on MNIST using Estimator. \n",
    "\n",
    "Note the MODEL_TYPE and change it to try out different models."
   ]
  },
  {
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
 "!sudo chown -R jupyter:jupyter /home/jupyter/training-data-analyst"
   ]
  },
  {
     "cell_type": "code",
     "execution_count": 1,
     "metadata": {},
     "outputs": [],
     "source": [
      "from datetime import datetime\n",
      "import os\n",
      "\n",
      "PROJECT = \"your-project-id-here\"  # REPLACE WITH YOUR PROJECT ID\n",
      "BUCKET = \"your-bucket-id-here\"  # REPLACE WITH YOUR BUCKET NAME\n",
      "REGION = \"us-central1\"  # REPLACE WITH YOUR BUCKET REGION e.g. us-central1\n",
      "MODEL_TYPE = \"dnn\"  # \"linear\", \"dnn_dropout\", \"cnn\", or \"dnn\"\n",
      "\n",
      "# Do not change these\n",
      "os.environ[\"PROJECT\"] = PROJECT\n",
      "os.environ[\"BUCKET\"] = BUCKET\n",
      "os.environ[\"REGION\"] = REGION\n",
      "os.environ[\"MODEL_TYPE\"] = MODEL_TYPE\n",
      "os.environ[\"TFVERSION\"] = \"2.1\"  # Tensorflow  version\n",
      "os.environ[\"IMAGE_URI\"] = os.path.join(\"gcr.io\", PROJECT, \"mnistmodel\")"
     ]
    },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Updated property [core/project].\n",
      "Updated property [compute/region].\n"
     ]
    }
   ],
   "source": [
    "%%bash\n",
    "gcloud config set project $PROJECT\n",
    "gcloud config set compute/region $REGION"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Building a dynamic model\n",
    "\n",
    "The boilerplate structure for this module has already been set up in the folder `mnistmodel`. The module lives in the sub-folder, `trainer`, and is designated as a python package with the empty `__init__.py` (`mnistmodel/trainer/__init__.py`) file. It still needs the model and a trainer to run it, so let's make them.\n",
    "\n",
    "Let's start with the trainer file first. This file parses command line arguments to feed into the model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Overwriting mnistmodel/trainer/task.py\n"
     ]
    }
   ],
   "source": [
    "%%writefile mnistmodel/trainer/task.py\n",
    "import argparse\n",
    "import json\n",
    "import os\n",
    "import sys\n",
    "\n",
    "from . import model\n",
    "\n",
    "\n",
    "def _parse_arguments(argv):\n",
    "    \"\"\"Parses command-line arguments.\"\"\"\n",
    "    parser = argparse.ArgumentParser()\n",
    "    parser.add_argument(\n",
    "        '--model_type',\n",
    "        help='Which model type to use',\n",
    "        type=str, default='linear')\n",
    "    parser.add_argument(\n",
    "        '--epochs',\n",
    "        help='The number of epochs to train',\n",
    "        type=int, default=10)\n",
    "    parser.add_argument(\n",
    "        '--steps_per_epoch',\n",
    "        help='The number of steps per epoch to train',\n",
    "        type=int, default=100)\n",
    "    parser.add_argument(\n",
    "        '--job-dir',\n",
    "        help='Directory where to save the given model',\n",
    "        type=str, default='mnistmodel/')\n",
    "    return parser.parse_known_args(argv)\n",
    "\n",
    "\n",
    "def main():\n",
    "    \"\"\"Parses command line arguments and kicks off model training.\"\"\"\n",
    "    args = _parse_arguments(sys.argv[1:])[0]\n",
    "\n",
    "    # Configure path for hyperparameter tuning.\n",
    "    trial_id = json.loads(\n",
    "        os.environ.get('TF_CONFIG', '{}')).get('task', {}).get('trial', '')\n",
    "    output_path = args.job_dir if not trial_id else args.job_dir + '/'\n",
    "\n",
    "    model_layers = model.get_layers(args.model_type)\n",
    "    image_model = model.build_model(model_layers, args.job_dir)\n",
    "    model_history = model.train_and_evaluate(\n",
    "        image_model, args.epochs, args.steps_per_epoch, args.job_dir)\n",
    "\n",
    "\n",
    "if __name__ == '__main__':\n",
    "    main()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, let's group non-model functions into a util file to keep the model file simple. We'll copy over the `scale` and `load_dataset` functions from the previous lab."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Overwriting mnistmodel/trainer/util.py\n"
     ]
    }
   ],
   "source": [
    "%%writefile mnistmodel/trainer/util.py\n",
    "import tensorflow as tf\n",
    "\n",
    "\n",
    "def scale(image, label):\n",
    "    \"\"\"Scales images from a 0-255 int range to a 0-1 float range\"\"\"\n",
    "    image = tf.cast(image, tf.float32)\n",
    "    image /= 255\n",
    "    image = tf.expand_dims(image, -1)\n",
    "    return image, label\n",
    "\n",
    "\n",
    "def load_dataset(\n",
    "        data, training=True, buffer_size=5000, batch_size=100, nclasses=10):\n",
    "    \"\"\"Loads MNIST dataset into a tf.data.Dataset\"\"\"\n",
    "    (x_train, y_train), (x_test, y_test) = data\n",
    "    x = x_train if training else x_test\n",
    "    y = y_train if training else y_test\n",
    "    # One-hot encode the classes\n",
    "    y = tf.keras.utils.to_categorical(y, nclasses)\n",
    "    dataset = tf.data.Dataset.from_tensor_slices((x, y))\n",
    "    dataset = dataset.map(scale).batch(batch_size)\n",
    "    if training:\n",
    "        dataset = dataset.shuffle(buffer_size).repeat()\n",
    "    return dataset\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, let's code the models! The [tf.keras API](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras) accepts an array of [layers](https://www.tensorflow.org/api_docs/python/tf/keras/layers) into a [model object](https://www.tensorflow.org/api_docs/python/tf/keras/Model), so we can create a dictionary of layers based on the different model types we want to use. The below file has two functions: `get_layers` and `create_and_train_model`. We will build the structure of our model in `get_layers`. Last but not least, we'll copy over the training code from the previous lab into `train_and_evaluate`.\n",
    "\n",
    "These models progressively build on each other. Look at the imported `tensorflow.keras.layers` modules and the default values for the variables defined in `get_layers` for guidance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Overwriting mnistmodel/trainer/model.py\n"
     ]
    }
   ],
   "source": [
    "%%writefile mnistmodel/trainer/model.py\n",
    "import os\n",
    "import shutil\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import tensorflow as tf\n",
    "from tensorflow.keras import Sequential\n",
    "from tensorflow.keras.callbacks import TensorBoard\n",
    "from tensorflow.keras.layers import (\n",
    "    Conv2D, Dense, Dropout, Flatten, MaxPooling2D, Softmax)\n",
    "\n",
    "from . import util\n",
    "\n",
    "\n",
    "# Image Variables\n",
    "WIDTH = 28\n",
    "HEIGHT = 28\n",
    "\n",
    "\n",
    "def get_layers(\n",
    "        model_type,\n",
    "        nclasses=10,\n",
    "        hidden_layer_1_neurons=400,\n",
    "        hidden_layer_2_neurons=100,\n",
    "        dropout_rate=0.25,\n",
    "        num_filters_1=64,\n",
    "        kernel_size_1=3,\n",
    "        pooling_size_1=2,\n",
    "        num_filters_2=32,\n",
    "        kernel_size_2=3,\n",
    "        pooling_size_2=2):\n",
    "    \"\"\"Constructs layers for a keras model based on a dict of model types.\"\"\"\n",
    "    model_layers = {\n",
    "        'linear': [\n",
    "            Flatten(),\n",
    "            Dense(nclasses),\n",
    "            Softmax()\n",
    "        ],\n",
    "        'dnn': [\n",
    "            Flatten(),\n",
    "            Dense(hidden_layer_1_neurons, activation='relu'),\n",
    "            Dense(hidden_layer_2_neurons, activation='relu'),\n",
    "            Dense(nclasses),\n",
    "            Softmax()\n",
    "        ],\n",
    "        'dnn_dropout': [\n",
    "            Flatten(),\n",
    "            Dense(hidden_layer_1_neurons, activation='relu'),\n",
    "            Dense(hidden_layer_2_neurons, activation='relu'),\n",
    "            Dropout(dropout_rate),\n",
    "            Dense(nclasses),\n",
    "            Softmax()\n",
    "        ],\n",
    "        'dnn': [\n",
    "            Conv2D(num_filters_1, kernel_size=kernel_size_1,\n",
    "                   activation='relu', input_shape=(WIDTH, HEIGHT, 1)),\n",
    "            MaxPooling2D(pooling_size_1),\n",
    "            Conv2D(num_filters_2, kernel_size=kernel_size_2,\n",
    "                   activation='relu'),\n",
    "            MaxPooling2D(pooling_size_2),\n",
    "            Flatten(),\n",
    "            Dense(hidden_layer_1_neurons, activation='relu'),\n",
    "            Dense(hidden_layer_2_neurons, activation='relu'),\n",
    "            Dropout(dropout_rate),\n",
    "            Dense(nclasses),\n",
    "            Softmax()\n",
    "        ]\n",
    "    }\n",
    "    return model_layers[model_type]\n",
    "\n",
    "\n",
    "def build_model(layers, output_dir):\n",
    "    \"\"\"Compiles keras model for image classification.\"\"\"\n",
    "    model = Sequential(layers)\n",
    "    model.compile(optimizer='adam',\n",
    "                  loss='categorical_crossentropy',\n",
    "                  metrics=['accuracy'])\n",
    "    return model\n",
    "\n",
    "\n",
    "def train_and_evaluate(model, num_epochs, steps_per_epoch, output_dir):\n",
    "    \"\"\"Compiles keras model and loads data into it for training.\"\"\"\n",
    "    mnist = tf.keras.datasets.mnist.load_data()\n",
    "    train_data = util.load_dataset(mnist)\n",
    "    validation_data = util.load_dataset(mnist, training=False)\n",
    "\n",
    "    callbacks = []\n",
    "    if output_dir:\n",
    "        tensorboard_callback = TensorBoard(log_dir=output_dir)\n",
    "        callbacks = [tensorboard_callback]\n",
    "\n",
    "    history = model.fit(\n",
    "        train_data,\n",
    "        validation_data=validation_data,\n",
    "        epochs=num_epochs,\n",
    "        steps_per_epoch=steps_per_epoch,\n",
    "        verbose=2,\n",
    "        callbacks=callbacks)\n",
    "\n",
    "    if output_dir:\n",
    "        export_path = os.path.join(output_dir, 'keras_export')\n",
    "        model.save(export_path, save_format='tf')\n",
    "\n",
    "    return history\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Local Training\n",
    "\n",
    "Now that we know that our models are working as expected, let's run it on the [Google Cloud AI Platform](https://cloud.google.com/ml-engine/docs/). We can run it as a python module locally first using the command line.\n",
    "\n",
    "The below cell transfers some of our variables to the command line as well as create a job directory including a timestamp.\n",
    "\n",
    "You can change the model_type to try out different models."
  ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "current_time = datetime.now().strftime(\"%y%m%d_%H%M%S\")\n",
    "model_type = 'dnn'\n",
    "\n",
    "os.environ[\"MODEL_TYPE\"] = model_type\n",
    "os.environ[\"JOB_DIR\"] = \"gs://{}/mnist_{}_{}/\".format(\n",
    "    BUCKET, model_type, current_time)\n",
    "os.environ[\"JOB_NAME\"] = \"mnist_{}_{}\".format(\n",
    "    model_type, current_time)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cell below runs the local version of the code. The epochs and steps_per_epoch flag can be changed to run for longer or shorther, as defined in our `mnistmodel/trainer/task.py` file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "python3 -m mnistmodel.trainer.task \\\n",
    "    --job-dir=$JOB_DIR \\\n",
    "    --epochs=5 \\\n",
    "    --steps_per_epoch=50 \\\n",
    "    --model_type=$MODEL_TYPE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Training on the cloud\n",
    "\n",
    "Since we're using an unreleased version of TensorFlow on AI Platform, we can instead use a [Deep Learning Container](https://cloud.google.com/ai-platform/deep-learning-containers/docs/overview) in order to take advantage of libraries and applications not normally packaged with AI Platform. Below is a simple [Dockerlife](https://docs.docker.com/engine/reference/builder/) which copies our code to be used in a TF2 environment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Overwriting mnistmodel/Dockerfile\n"
     ]
    }
   ],
   "source": [
    "%%writefile mnistmodel/Dockerfile\n",
    "FROM gcr.io/deeplearning-platform-release/tf2-cpu\n",
    "COPY mnistmodel/trainer /mnistmodel/trainer\n",
    "ENTRYPOINT [\"python3\", \"-m\", \"mnistmodel.trainer.task\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The below command builds the image and ships it off to Google Cloud so it can be used for AI Platform. When built, it will show up [here](http://console.cloud.google.com/gcr) with the name `mnistmodel`. ([Click here](https://console.cloud.google.com/cloud-build) to enable Cloud Build)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "!docker build -f mnistmodel/Dockerfile -t $IMAGE_URI ./"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "!docker push $IMAGE_URI"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we can kickoff the [AI Platform training job](https://cloud.google.com/sdk/gcloud/reference/ai-platform/jobs/submit/training). We can pass in our docker image using the `master-image-uri` flag."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "echo $JOB_DIR $REGION $JOB_NAME\n",
    "gcloud ai-platform jobs submit training $JOB_NAME \\\n",
    "    --staging-bucket=gs://$BUCKET \\\n",
    "    --region=$REGION \\\n",
    "    --master-image-uri=$IMAGE_URI \\\n",
    "    --scale-tier=BASIC_GPU \\\n",
    "    --job-dir=$JOB_DIR \\\n",
    "    -- \\\n",
    "    --model_type=$MODEL_TYPE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Can't wait to see the results? Run the code below and copy the output into the [Google Cloud Shell](https://console.cloud.google.com/home/dashboard?cloudshell=true) to follow."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Deploying and predicting with model\n",
    "\n",
    "Once you have a model you're proud of, let's deploy it! All we need to do is give AI Platform the location of the model. Below uses the keras export path of the previous job, but `${JOB_DIR}keras_export/` can always be changed to a different path.\n",
    "\n",
    "Uncomment the delete commands below if you are getting an \"already exists error\" and want to deploy a new model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "MODEL_NAME=\"mnist\"\n",
    "MODEL_VERSION=${MODEL_TYPE}\n",
    "MODEL_LOCATION=${JOB_DIR}keras_export/\n",
    "echo \"Deleting and deploying $MODEL_NAME $MODEL_VERSION from $MODEL_LOCATION ... this will take a few minutes\"\n",
    "#yes | gcloud ai-platform versions delete ${MODEL_VERSION} --model ${MODEL_NAME}\n",
    "#yes | gcloud ai-platform models delete ${MODEL_NAME}\n",
    "gcloud ai-platform models create ${MODEL_NAME} --regions $REGION\n",
    "gcloud ai-platform versions create ${MODEL_VERSION} \\\n",
    "    --model ${MODEL_NAME} \\\n",
    "    --origin ${MODEL_LOCATION} \\\n",
    "    --framework tensorflow \\\n",
    "    --runtime-version=2.1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Copyright 2020 Google Inc.\n",
    "Licensed under the Apache License, Version 2.0 (the \"License\"); you may not use this file except in compliance with the License. You may obtain a copy of the License at\n",
    "http://www.apache.org/licenses/LICENSE-2.0\n",
    "Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
